AE000479 10934 bp DNA linear BCT 01-DEC-2000

Escherichia coli K12 MG1655 section 369 of 400 of the complete 
genome. 

AE000479 U00096 
AE000479.1

Escherichia coli K12
Escherichia coli K12

Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
Enterobacteriaceae; Escherichia.

1 (bases 1 to 10934) 

Blattner,F.R., Plunkett,G. III, Bloch,C.A., Perna,N.T., Burland,V.,
Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F.,
Gregor,J., Davis,N.W., Kirkpatrick,H.A., Goeden,M.A., Rose,D.J.,
Mau,B. and Shao,Y.

The complete genome sequence of Escherichia coli K-12 

Science 277 (5331), 1453-1474 (1997) 

97426617 

9278503 

2 (bases 1 to 10934) 
Blattner, F.R. 
Direct Submission 

Submitted (16-JAN-1997) Guy Plunkett III, Laboratory of Genetics,
University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA.
Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 
608-263-7459 

3 (bases 1 to 10934) 
Blattner, F.R. 
Direct Submission 

Submitted (02-SEP-1997) Guy Plunkett III, Laboratory of Genetics,
University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA.
Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax: 
608-263-7459 

4 (bases 1 to 10934) 
Plunkett, G. III. 

Direct Submission

Submitted (13-OCT-1998) Laboratory of Genetics, University of
Wisconsin, 445 Henry Mall, Madison, WI 53706, USA
On Sep 9, 1997 this sequence version replaced gi:1790489.
This sequence was determined by the E. coli Genome Project at the 
University of Wisconsin-Madison (Frederick R. Blattner, director). 
Supported by NIH grants HG00301 and HG01428 (from the Human Genome 
Project and NCHGR). The entire sequence was independently 
determined from E. coli K12 strain MG1655. Predicted open reading 
frames were determined using GeneMark software, kindly supplied by 
Mark Borodovsky, Georgia Institute of Technology, Atlanta, GA, 
30332 [e-mail: mark@amber.gatech.edu]. Open reading frames that 
have been correlated with genetic loci are being annotated with CG 
Site Nos., unique ID nos. for the genes in the E. coli Genetic 
Stock Center (CGSC) database at Yale University, kindly supplied by 
Mary Berlyn. A public version of the database is accessible 
(http://cgsc.biology.yale.edu). Annotation of the genome is an 
ongoing task whose goal is to make the genome sequence more useful 
by correlating it with other data. Comments to the authors are 
appreciated. Updated information will be available at the E. coli 
Genome Project's World Wide Web site
(http://www.genetics.wisc.edu). *** The E. coli K12 sequence and
its annotations are periodically updated; this is version M54. No
sequence changes. Annotation updates: updated gene identifications
and products; all new functional assignments courtesy of Monica
Riley; added promoters, protein binding sites, and repeated
sequences described in reference 1. The unique numeric identifiers
beginning with a lowercase 'b' assigned to each gene (protein- or
RNA-encoding) are now designated as gene synonyms instead of
labels. This should allow them to be searched for in Entrez as gene
names.

FEATURES Location/Qualifiers

source 1..10934
/organism="Escherichia coli K12"
/strain="K12"
/sub_strain="MG1655"
/db_xref="taxon:83333"
protein_bind 118..140
/note="central position to predicted promoter: -202.5"
/bound_moiety="GlpR predicted site"
promoter 298..326
/note="factor Sigma70; predicted +1 start at 4266933"
protein_bind 310..337
/note="central position to predicted promoter: -8"
/bound_moiety="DeoR predicted site"
gene 393..1106
/gene="aphA"
/note="synonym: b4055"
CDS 393..1106
/gene="aphA"
/function="enzyme; Central intermediary metabolism:
Nucleotide hydrolysis"
/note="o237; sequence change joins two ORFs relative to
earlier version; 99.2 pct identical to the conceptual
ORF YJBP_ECOLI SW: P32697"
/codon_start=1
/transl_table=11
/product="diadenosine tetraphosphatase"
/protein_id="AAC77025.1"
/db_xref="GI:2367341"
/translation="MRKITQAISAVCLLFALNSSAVALASSPSPLNPGTNVARLAEQA
PIHWVSVAQIENSLAGRPPMAVGFDIDDTVLFSSPGFWRGKKTFSPESEDYLKNPVFW
EKMNNGWDEFSIPKEVARQLIDMHVRRGDAIFFVTGRSPTKTETVSKTLADNFHIPAT
NMNPVIFAGDKPGQNTKSQWLQDKNIRIFYGDSDNDITAARDVGARGIRILRASNSTY
KPLPQAGAFGEEVIVNSEY"

promoter 1174..1202
/note="factor Sigma70; predicted +1 start at 4267809"
gene 1217..1633
/gene="yjbQ"
/note="synonym: b4056"
CDS 1217..1633
/gene="yjbQ"
/function="orf; Unknown"
/note="o138; 100 pct identical amino acid sequence and
equal length to YJBQ_ECOLI SW: P32698"
/codon_start=1
/transl_table=11
/product="orf, hypothetical protein"
/protein_id="AAC77026.1"
/db_xref="GI:1790491"
/translation="MWYQKTLTLSAKSRGFHLVTDEILNQLADMPRVNIGLLHLLLQH
TSASLTLNENCDPTVRHDMERFFLRTVPDNGNYEHDYEGADDMPSHIKSSMLGTSLVL
PVHKGRIQTGTWQGIWLGEHRIHGGSRRIIATLQGE"
gene 1637..1993
/gene="yjbR"
/note="synonym: b4057"
CDS 1637..1993
/gene="yjbR"
/function="orf; Unknown"
/note="o118; 100 pct identical amino acid sequence and
equal length to YJBR_ECOLI SW: P32699"
/codon_start=1
/transl_table=11
/product="orf, hypothetical protein"
/protein_id="AAC77027.1"
/db_xref="GI:2367342"
/translation="MTISELLQYCMAKPGAEQSVHNDWKATQIKVEDVLFAMVKEVEN
RPAVSLKTSPELAELLRQQHSDVRPSRHLNKAHWSTVYLDGSLPDSQIYYLVDASYQQ
AVNLLPEEKRKLLVQL"
gene complement(2028..4850)
/gene="uvrA"
/note="synonym: b4058"
CDS complement(2028..4850)
/gene="uvrA"
/function="enzyme; DNA- replication, repair,
restriction/modification"
/note="o940; 99 pct identical amino acid sequence and
equal length to UVRA_ECOLI SW: P07671; CG Site No. 21"
/codon_start=1
/transl_table=11
/product="excision nuclease subunit A"
/protein_id="AAC77028.1"
/db_xref="GI:2367343"
/translation="MDKIEVRGARTHNLKNINLVIPRDKLIVVTGLSGSGKSSLAFDT
LYAEGQRRYVESLSAYARQFLSLMEKPDVDHIEGLSPAISIEQKSTSHNPRSTVGTIT
EIHDYLRLLFARVGEPRCPDHDVPLAAQTVSQMVDNVLSQPEGKRLMLLAPIIKERKG
EHTKTLENLASQGYIRARIDGEVCDLSDPPKLELQKKHTIEVVVDRFKVRDDLTQRLA
ESFETALELSGGTAVVADMDDPKAEELLFSANFACPICGYSMRELEPRLFSFNNPAGA
CPTCDGLGVQQYFDPDRVIQNPELSLAGGAIRGWDRRNFYYFQMLKSLADHYKFDVEA
PWGSLSANVHKVVLYGSGKENIEFKYMNDRGDTSIRRHPFEGVLHNMERRYKETESSA
VREELAKFISNRPCASCEGTRLRREARHVYVENTPLPAISDMSIGHAMEFFNNLKLAG
QRAKIAEKILKEIGDRLKFLVNVGLNYLTLSRSAETLSGGEAQRIRLASQIGAGLVGV
MYVLDEPSIGLHQRDNERLLGTLIHLRDLGNTVIVVEHDEDAIRAADHVIDIGPGAGV
HGGEVVAEGPLEAIMAVPESLTGQYMSGKRKIEVPKKRVPANPEKVLKLTGARGNNLK
DVTLTLPVGLFTCITGVSGSGKSTLINDTLFPIAQRQLNGATIAEPAPYRDIQGLEHF
DKVIDIDQSPIGRTPRSNPATYTGVFTPVRELFAGVPESRARGYTPGRFSFNVRGGRC
EACQGDGVIKVEMHFLPDIYVPCDQCKGKRYNRETLEIKYKGKTIHEVLDMTIEEARE
FFDAVPALARKLQTLMDVGLTYIRLGQSATTLSGGEAQRVKLARELSKRGTGQTLYIL
DEPTTGLHFADIQQLLDVLHKLRDQGNTIVVIEHNLDVIKTADWIVDLGPEGGSGGGE
ILVSGTPETVAECEASHTARFLKPML"
promoter complement(4901..4927)
/note="factor Sigma70; promoter uvrA; documented +1 at
4271488"
protein_bind complement(4909..4929)
/note="central position to uvrA promoter: -10.5"
/bound_moiety="LexA documented site"
protein_bind 4934..4954
/note="central position to ssb promoter: -45.5"
/bound_moiety="LexA documented site"
promoter 4948..4978
/note="factor Sigma70; promoter ssb; documented +1
at 4271590"
promoter 5026..5055
/note="factor Sigma70; promoter ssbp2; documented +1
at 4271663"
promoter 5041..5069
/note="factor Sigma70; promoter ssbp3; documented +1
at 4271674"
gene 5104..5640
/gene="ssb"
/note="synonym: b4059"
CDS 5104..5640
/gene="ssb"
/function="factor; DNA- replication, repair,
restriction/modification"
/note="o178; CG Site No. 150"
/codon_start=1
/transl_table=11
/product="ssDNA-binding protein"
/protein_id="AAC77029.1"
/db_xref="GI:1790494"
/translation="MASRGVNKVILVGNLGQDPEVRYMPNGGAVANITLATSESWRDK
ATGEMKEQTEWHRVVLFGKLAEVASEYLRKGSQVYIEGQLRTRKWTDQSGQDRYTTEV
VVNVGGTMQMLGGRQGGGAPAGGNIGGGQPQGGWGQPQQPQGGNQFSGGAQSRPQQSA
PAAPSNEPPMDFDDDIPF"
gene complement(5739..6089)
/gene="yjcB"
/note="synonym: b4060"
CDS complement(5739..6089)
/gene="yjcB"
/function="orf; Unknown"
/note="f116; 100 pct identical amino acid sequence and
equal length to YJCB_ECOLI SW: P32700"
/codon_start=1
/transl_table=11
/product="orf, hypothetical protein"
/protein_id="AAC77030.1"
/db_xref="GI:1790495"
/translation="MYHASREQRRAVQGIGSFQRNCLMATLTTGVVLLRWQLLSAVMM
FLASTLNIRFRRSDYVGLAVISSGLGVVSACWFAMGLLGITMADITAIWHNIESVMIE
EMNQTPPQWPMILT"
protein_bind 6146..6174
/note="central position to predicted promoter: -152.5"
/bound_moiety="GalR predicted site"
protein_bind 6146..6174
/note="central position to predicted promoter: -215.5"
/bound_moiety="GalR predicted site"
promoter complement(6156..6185)
/note="factor Sigma70; predicted +1 start at 4272749"
promoter 6280..6307
/note="factor Sigma70; predicted +1 start at 4272914"
protein_bind 6323..6345
/note="central position to predicted promoter: -41.5"
/bound_moiety="Fur predicted site"
protein_bind 6323..6345
/note="central position to predicted promoter: 21.5"
/bound_moiety="Fur predicted site"
promoter 6339..6370
/note="factor Sigma70; predicted +1 start at 4272977"
gene 6450..8036
/gene="yjcC"
/note="synonym: b4061"
CDS 6450..8036
/gene="yjcC"
/function="orf; Unknown"
/note="o528; 100 pct identical to YJCC_ECOLI SW:
P32701; similar to Azorhizobium caulinodans hypoth.
protein, ntrC 3' region"
/codon_start=1
/transl_table=11
/product="orf, hypothetical protein"
/protein_id="AAC77031.1"
/db_xref="GI:1790496"
/translation="MSHRARHQLLALPGIIFLVLFPIILSLWIAFLWAKSEVNNQLRT
FAQLALDKSELVIRQADLVSDAAERYQGQVCTPAHQKRMLNIIRGYLYINELIYARDN
HFLCSSLIAPVNGYTIAPADYKREPNVSIYYYRDTPFFSGYKMTYMQRGNYVAVINPL
FWSEVMSDDPTLQWGVYDTVTKTFFSLSKEASAATFSPLIHLKDLTVQRNGYLYATVY
STKRPIAAIVATSYQRLITHFYNHLIFALPAGILGSLVLLLLWLRIRQNYLSPKRKLQ
RALEKHQLCLYYQPIIDIKTEKCIGAEALLRWPGEQGQIMNPAEFIPLAEKEGMIEQI
TDYVIDNVFRDLGDYLATHADRYVSINLSASDFHTSRLIARINQKTEQYAVRPQQIKF
EVTEHAFLDVDKMTPIILAFRQAGYEVAIDDFGIGYSNLHNLKSLNVDILKIDKSFVE
TLTTHKTSHLIAEHIIELAHSLGLKTIAEGVETEEQVNWLRKRGVRYCQGWFFAKAMP
PQVFMQWMEQLPARELTRGQ"
gene complement(8039..8362)
/gene="soxS"
/note="synonym: b4062"
CDS complement(8039..8362)
/gene="soxS"
/function="regulator; Global regulatory functions"
/note="f107; 100 pct identical amino acid sequence and
equal length to SOXS_ECOLI SW: P22539"
/codon_start=1
/transl_table=11
/product="regulation of superoxide response regulon"
/protein_id="AAC77032.1"
/db_xref="GI:1790497"
/translation="MSHQKIIQDLIAWIDEHIDQPLNIDVVAKKSGYSKWYLQRMFRT
VTHQTLGDYIRQRRLLLAAVELRTTERPIFDIAMDLGYVSQQTFSRVFRRQFDRTPSD
YRHRL"
promoter 8388..8416
/note="factor Sigma70; predicted +1 start at 4275023"
promoter complement(8409..8438)
/note="factor Sigma70; predicted +1 start at 4275002"
protein_bind complement(8442..8471)
/note="central position to soxS promoter: -26"
/bound_moiety="SoxS documented site"
gene 8448..8912
/gene="soxR"
/note="synonym: b4063"
CDS 8448..8912
/gene="soxR"
/function="regulator; Global regulatory functions"
/note="o154; 100 pct identical amino acid sequence and
equal length to SOXR_ECOLI SW: P22538"
/codon_start=1
/transl_table=11
/product="redox-sensing activator of soxS"
/protein_id="AAC77033.1"
/db_xref="GI:1790498"
/translation="MEKKLPRIKALLTPGEVAKRSGVAVSALHFYESKGLITSIRNSG
NQRRYKRDVLRYVAIIKIAQRIGIPLATIGEAFGVLPEGHTLSAKEWKQLSSQWREEL
DRRIHTLVALRDELDGCIGCGCLSRSDCPLRNPGDRLGEEGTGARLLEDEQN"
promoter 9261..9290
/note="factor Sigma70; predicted +1 start at 4275897"
gene 9458..10807
/gene="yjcD"
/note="synonym: b4064"
CDS 9458..10807
/gene="yjcD"
/function="orf; Unknown"
/note="o449; 100 pct identical to YJCD_ECOLI SW:
P32702; matches PS00017: ATP/GTP-binding site motif A;
similar to two unidentified ORFs from the E. coli 82
min genomic region [L10328]: f470 and f445"
/codon_start=1
/transl_table=11
/product="orf, hypothetical protein"
/protein_id="AAC77034.1"
/db_xref="GI:1790499"
/translation="MSTPSARTGGSLDAWFKISQRGSTVRQEVVAGLTTFLAMVYSVI
VVPGMLGKAGFPPAAVFVATCLVAGLGSIVMGLWANLPLAIGCAISLTAFTAFSLVLG
QHISVPVALGAVFLMGVLFTVISATGIRSWILRNLPHGVAHGTGIGIGLFLLLIAANG
VGLVIKNPLDGLPVALGDFATFPVIMSLVGLAVIIGLEKLKVPGGILLTIIGISIVGL
IFDPNVHFSGVFAMPSLSDENGNSLIGSLDIMGALNPVVLPSVLALVMTAVFDATGTI
RAVAGQANLLDKDGQIIDGGKALTTDSMSSVFSGLVGAAPAAVYIESAAGTAAGGKTG
LTAITVGVLFLLILFLSPLSYLVPGYATAPALMYVGLLMLSNVAKIDFADFVDAMAGL
VTAVFIVLTCNIVTGIMIGFATLVIGRLVSGEWRKLNIGTVVIAVALVTFYAGGWAI"
promoter 10780..10809
/note="factor Sigma70; predicted +1 start at 4277416"
promoter 10890..10919
/note="factor Sigma70; predicted +1 start at 4277526"
BASE COUNT 2605 a 2691 c 2829 g 2809 t
ORIGIN

1 tccatcgcga gtaataaaat taatcaccat tgtagggtag ggggctggtc aatcagaaat

61 catctttata aacttcgatt gtttttgtaa tgctgtatca ttaagttcat taaatcgtac

121 agcagataaa tgttctatca aatttcgctc atttgccgag gattcatcat aataaacgta

181 aaattaatgt atccttacat cgagtaataa acatttttta tacaaaaaaa gacaggaacg

241 tatttactgg gttaaatata atcatcctgc ttttcatcac aaaaaccgca gataatcctt

301 cctttccccg gcagctggcg ttatggtcag atggtttttg caacaaatct cacaataaaa

361 agtttcaaca tactgactat ttagggaaaa atatgcgcaa gatcacacag gcaatcagtg

421 ccgtttgctt attgttcgct ctaaacagtt ccgctgttgc cctggcctca tctccttcac

481 cgcttaaccc tgggactaac gttgccaggc ttgctgaaca ggcacccatt cattgggttt

541 cggtcgcaca aattgaaaat agcctcgcag ggcgtccgcc aatggcggtg gggtttgata

601 tcgatgacac ggtacttttt tccagtccgg gcttctggcg cggcaaaaaa accttctcgc

661 cagaaagcga agattatctg aaaaatcctg tgttctggga aaaaatgaac aatggctggg

721 atgaattcag cattccaaaa gaggtcgctc gccagctgat tgatatgcat gtacgccgcg

781 gtgacgcgat cttctttgtg actggtcgta gcccgacgaa aacagaaacg gtttcaaaaa

841 cgctggcgga taattttcat attcctgcca ccaacatgaa tccggtgatc tttgcgggcg

901 ataaaccagg gcaaaataca aaatcgcaat ggctgcagga taaaaatatc cgaatttttt

961 atggcgattc tgataatgat attaccgccg cacgcgatgt cggcgctcgt ggtatccgca

1021 ttctgcgcgc ctccaactct acctacaaac ccttgccaca agcgggtgcg tttggtgaag

1081 aggtgatcgt caattcagaa tactgacaga gcgggagagc gtgatgctct cccgaatgct 

1141 gtttttttaa tcacaccttt atcctttcgc tgtcttgctg caaactgatt aagagagttt 

1201 tatcaaggag cagcacatgt ggtatcaaaa gacgctcacg cttagcgcca aatctcgtgg 

1261 gtttcatctg gtaacggatg aaattctgaa tcagctggct gatatgccgc gcgttaacat 

1321 cggcttactg catctgttgc tgcaacatac ctccgcctct ctgacactta atgagaactg 

1381 cgatcccacc gtacgccacg acatggagcg ttttttcctc cgcaccgttc ccgacaacgg 

1441 aaattatgag catgactatg agggagcaga cgatatgcct tctcatatca aatcctcaat 

1501 gctgggaaca tcgcttgtat tgccggtgca taaagggcgt attcagaccg gcacctggca 

1561 aggcatttgg ctgggggaac atcgcatcca cggcggatcg cgtcgcatca tcgcgacact 

1621 acaaggggag taaaaaatga ccatttcgga gttgctacaa tattgcatgg caaaaccagg 

1681 cgcagaacag agcgtgcata atgactggaa agcgacgcag atcaaagtgg aagatgtact 

1741 gtttgcgatg gtgaaagaag tagaaaatcg cccagctgtt tcgctgaaaa ccagcccgga 

1801 gctggcggag ctgctacgtc agcagcacag cgatgtgcgt ccaagccgcc atctgaataa 

1861 agcgcactgg agcaccgtgt atctcgacgg ttcgctgcca gattcgcaaa tctattatct 

1921 ggtggatgcg tcttatcagc aggcggtgaa tttactgccg gaagaaaaac gtaaattgct 

1981 ggtgcaactc tgaaaggaaa aggccgctca gaaagcggcc ttaacgatta cagcatcggc 

2041 ttaaggaagc gtgccgtgtg tgatgcttcg cactccgcga cggtttctgg cgtaccggag 

2101 acgaggatct cgccgccacc actgccgcct tctggtccca ggtcgacaat ccagtcagcg 

2161 gttttgatca cgtcgagatt gtgctcaatc accacaatgg tgttgccctg atcgcgcagt 

2221 ttatgcagta cgtcgagcag ttgctgaata tcggcgaagt gcagaccggt ggtcggctcg 

2281 tcgagaatat acagcgtctg cccggtgccg cgttttgaca gttcacgcgc cagcttcacg 

2341 cgctgggctt caccgcctga aagggtggtt gcggactgcc ccagtcgaat gtacgtcagg 

2401 ccaacgtcca tcaacgtttg cagcttacgc gccagtgcag gtacggcatc aaagaactca 

2461 cgcgcctctt cgatggtcat atccagcact tcgtggatgg ttttgccttt gtacttaatc 

2521 tccagcgttt cacggttata gcgtttacct ttgcactggt cgcacggcac gtagatatcc 

2581 ggcaggaagt gcatctccac tttgatcacg ccatcgccct gacaggcctc gcagcgtccg 

2641 ccacgaacgt taaagctgaa acgtcccggc gtatagccgc gcgcacggga ttccggtacg 

2701 cccgcaaaca gttcgcgcac aggcgtaaac acgccggtat aggtcgccgg gttagaacgt 

2761 ggagtacgac caattgggct ttggtcgata tcgatcactt tatcgaaatg ctccagcccc 

2821 tgaatatcgc gatacggtgc tggttcggcg atggtcgccc cattcaactg gcgttgggca 

2881 atcgggaaca gtgtgtcgtt aatcagcgtc gatttaccgg aacctgaaac cccggtgatg 

2941 caggtaaaca gacccaccgg cagcgtcagc gtcacgtcct tcaggttgtt gccgcgtgcg 

3001 cctgtcagct tcagcacttt ttccggattc gccggaacgc gtttcttcgg cacttcaatc 

3061 ttgcgtttgc cgctcatgta ctgcccggtc aacgactccg gcaccgccat aatcgcttcc 

3121 agcggacctt ctgcgaccac ttcaccgccg tgaacacctg cgcccgggcc aatgtcgatc 

3181 acatggtcag cggcgcgaat tgcgtcttcg tcgtgctcca ccacaatcac ggtattaccg 

3241 agatcgcgca gatggataag cgtacccaac aggcgctcgt tatcacgctg gtgcaggccg 

3301 atagacggct cgtccagcac gtacataacg ccaaccaggc ccgcaccaat ctggctcgcc 

3361 agacggatac gctgtgcttc accgccagaa agcgtttctg ccgagcggga aagcgtcagg 

3421 taattcaggc cgacgttaac gaggaatttc agacgatcgc cgatctcttt aaggattttt 

3481 tctgcaatct tcgcccgctg acctgcgagt ttgagattgt tgaagaattc catcgcatga 

3541 ccaatgctca tgtcggagat agcaggcagc ggcgtattct cgacatacac gtggcgcgct 

3601 tcccgacgca gacgcgtccc ttcgcagctg gcgcacggac gattactgat aaacttggct 

3661 aattcttcgc gtaccgcgct ggattccgtc tctttatagc ggcgctccat attatgcagc 

3721 acgccttcga acggatgacg acgaatggag gtatcgccac gatcgttcat gtatttgaat 

3781 tcaatgtttt ctttgccaga accgtacaac accactttat gcacgttcgc gctcaggctg 

3841 ccccacggcg cttcgacgtc gaacttatag tgatctgcca gcgatttcag catctggaaa 

3901 taatagaagt tgcggcgatc ccagccacgg atcgcaccac cagccagcga cagttccgga 

3961 ttctggatca ctcgatcagg atcgaaatat tgctgtacgc caaggccgtc gcaggtcggg

4021 caggcccccg ccgggttgtt aaacgaaaac agtcgcggct ccagttcacg catactgtag 

4081 ccgcaaattg ggcaggcgaa gttggcggag aacagcagct cttccgcttt cgggtcgtcc 

4141 atatccgcca ctaccgcggt accaccggaa agctccagcg cggtttcaaa tgactcggca 

4201 agacgttggg taagatcgtc acgcaccttg aagcgatcaa ccaccacttc aatggtatgt 

4261 ttcttttgca gttccagttt tggcggatcg gaaagatcgc agacttcgcc atcaatacga 

4321 gcacggatgt agccctggct tgccaggttc tccagcgttt tggtgtgttc gcctttgcgc 

4381 tctttaatga ttggcgcgag tagcatcaga cgcttgcctt ccggctgcga cagcacgtta 

4441 tccaccatct ggctgacggt ttgcgccgcc agcgggacgt cgtggtccgg acagcgcggc 

4501 tcgccaacgc gggcgaataa caaacgcaaa tagtcgtgga tttcggtgat tgtccccacc 

4561 gtagaacgcg ggttatgaga cgtcgatttc tgctcaattg agatggcagg agaaagcccc 

4621 tcaatatgat cgacgtccgg cttttccatc agtgacagaa actgccgcgc gtaggcggaa 

4681 agggattcaa cgtaacggcg ctgcccttcg gcatataagg tgtcgaaagc gagcgaggat 

4741 ttgccagaac ccgaaagccc ggtcacgaca atgagcttgt cgcgggggat aacgaggttg 

4801 atgtttttga gattatgggt gcgggcgccc cgaacttcga tcttatccat tcacctttcc 

4861 cggattaaac gcttttttgc ccggtggcat ggtgctaccg gcgatcacaa acggttaatt 

4921 atgacacaaa ttgacctgaa tgaatataca gtattggaat gcattacccg gagtgttgtg 

4981 taacaatgtc tggccaggtt tgtttcccgg aaccgaggtc acaacatagt aaaagcgcta 

5041 ttggtaatgg tacaatcgcg cgtttacact tattcagaac gatttttttc aggagacacg 

5101 aacatggcca gcagaggcgt aaacaaggtt attctcgttg gtaatctggg tcaggacccg 

5161 gaagtacgct acatgccaaa tggtggcgca gttgccaaca ttacgctggc tacttccgaa 

5221 tcctggcgtg ataaagcgac cggcgagatg aaagaacaga ctgaatggca ccgcgttgtg 
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5281 ctgttcggca aactggcaga agtggcgagc gaatatctgc gtaaaggttc tcaggtttat 

5341 atcgaaggtc agctgcgtac ccgtaaatgg accgatcaat ccggtcagga tcgctacacc 

5401 acagaagtcg tggtgaacgt tggcggcacc atgcagatgc tgggtggtcg tcagggtggt 

5461 ggcgctccgg caggtggcaa tatcggtggt ggtcagccgc agggcggttg gggtcagcct 

5521 cagcagccgc agggtggcaa tcagttcagc ggcggcgcgc agtctcgccc gcagcagtcc 

5581 gctccggcag cgccgtctaa cgagccgccg atggactttg atgatgacat tccgttctga 

5641 tttgtcatta aaacaatagg ttatattgtt ttaaggtgga tgattaaagc atctgccagc 

5701 cataaaaaag aagcctccgt tatggaggct tctacgtatc aggtcaaaat cattggccat 

5761 tgtggcggtg tctgattcat ctcttctatc atcaccgact cgatgttgtg ccagatagcg 

5821 gtgatgtccg ccattgtgat gccaagcaac cccattgcga accaacaggc ggaaacaacg 

5881 cccagaccgc tgctgatcac cgcaagcccg acataatcag accgacgaaa acggatgttg 

5941 agtgtgctgg ccagaaacat cattacggca ctaagaagtt gccagcgaag aagaaccacg 

6001 ccagtggtga gggtagccat caaacaattc ctctgaaaag agccgatgcc ctggacagcg 

6061 cggcgttgtt cacgggaggc gtggtacact ctggctatcg cggggcttgc agaacacaaa 

6121 aatgaaacac tctgtttgtt tcattaattt tgtgaactat atcacaattg attgtttgtt 

6181 agccatatta ggccgtgact tttattgctg tacagattat gtggtttttc agtggcatta 

6241 agggcatatc ttcccgccgc ctctgcattc ctgtaggaaa ttaattttga atatcaatga 

6301 attattttca tccaggtgac gattagaaag gtatcaattt caaatcaggc aaaagtgcta 

6361 tttataccgt aagatttatc taaagacgtc ggtacccagg gttttcacct tgcaatggcc 

6421 gggtataaac aggcaggaaa ttgatagcaa tgagtcatcg tgcacgacac caattactgg 

6481 cgttgccggg cattatcttt ttagttctct ttcccatcat tctttcgcta tggattgcct 

6541 tcctttgggc aaaatcagaa gtgaataatc agctccgaac ctttgctcaa ctggcactgg 

6601 ataaatccga gctggtcatt cgccaggcag atttagtgag cgatgcagct gaacgctatc 

6661 aggggcaagt ttgcactcca gcccatcaaa agcgaatgtt gaatattatt cgtggctatc 

6721 tttatattaa tgaattgatc tatgcccgtg ataaccattt tttatgctca tcgctgatag 

6781 cgcctgtaaa cggctatacg attgcaccgg ccgattataa gcgtgaacct aacgtttcta 

6841 tctattatta ccgcgatacg ccttttttct ctggctataa aatgacctat atgcagcggg 

6901 gaaattatgt ggcggttatc aaccctctct tctggagtga agtgatgtct gatgacccga 

6961 cattgcaatg gggtgtgtat gatacggtga cgaaaacctt tttctcgtta agcaaagagg 

7021 cctcggcagc aacgttttct ccgctgattc atttgaagga tttaaccgta caaagaaatg 

7081 gctatttata tgcgacagtt tattcgacaa aacgcccaat tgcagccatt gttgcgactt 

7141 catatcaacg tcttataacc catttttata atcatcttat ttttgcgttg cccgccggta 

7201 ttttggggag tcttgttctg ctattactct ggctacgtat tcgacaaaac tatttatctc 

7261 ccaaacgtaa attgcaacgc gccctcgaaa aacatcaact ttgtctttat taccagccaa 

7321 taatcgatat caaaacagaa aaatgtatcg gcgctgaagc gttgttacgt tggcctggtg 

7381 agcaggggca aataatgaat ccggcagagt ttattccgct ggcagaaaag gaggggatga 

7441 tagaacagat aactgattat gttattgata atgtcttccg cgatctgggc gattacctgg 

7501 caacacatgc agatcgctat gtttctatta acctgtcggc ctccgatttt catacgtcac 

7561 ggttgatagc gcgaatcaat cagaaaacag agcaatacgc ggtgcgtccg cagcaaatta 

7621 aatttgaagt gactgagcat gcatttcttg atgttgacaa aatgacgccg attattctgg 

7681 ctttccgcca ggcaggttac gaagtggcaa ttgatgattt tggtattggc tactctaact 

7741 tgcataacct taaatcattg aatgtcgata ttttgaaaat cgacaaatcg tttgttgaaa 

7801 cgctgaccac ccacaaaacc agtcatttga ttgcggaaca catcatcgag ctggcgcaca 

7861 gcctggggtt aaaaacgatc gctgaaggcg tcgaaactga ggagcaggtt aactggctgc 

7921 gcaaacgcgg cgtgcgctat tgccagggat ggttctttgc gaaggcgatg ccgccgcagg 

7981 tgtttatgca atggatggag caattacccg cgcgggagtt aacgcgcggg caataaaatt 

8041 acaggcggtg gcgataatcg ctgggagtgc gatcaaactg ccgacggaaa acgcgggaga 

8101 aggtctgctg cgagacataa cccaggtcca ttgcgatatc aaaaatcgga cgctcggtgg 

8161 tgcgcaactc aacggcggcc agtaacaggc ggcgttggcg aatgtaatcg ccaagcgtct 

8221 gatgcgtcac cgtgcggaac attcgttgca agtaccactt tgaatagcct gatttttttg 

8281 cgactacatc aatgttaagc ggctggtcaa tatgctcgtc aatccatgcg ataagatcct 

8341 gaataatttt ctgatgggac ataaatctgc ctcttttcag tgttcagttc gttaattcat 

8401 ctgttgggga gtataattcc tcaagttaac ttgaggtaaa gcgatttatg gaaaagaaat 

8461 taccccgcat taaagcgctg ctaacccccg gcgaagtggc gaaacgcagc ggtgtggcgg 

8521 tatcggcgct gcatttctat gaaagtaaag ggttgattac cagtatccgt aacagcggca 

8581 atcagcggcg atataaacgt gatgtgttgc gatatgttgc aattatcaaa attgctcagc 

8641 gtattggcat tccgctggcg accattggtg aagcgtttgg cgtgttgccc gaagggcata 

8701 cgttaagtgc gaaagagtgg aaacagcttt cgtcccaatg gcgagaagag ttggatcggc 

8761 gcattcatac cttagtggcg ctgcgtgacg aactggacgg atgtattggt tgtggctgcc 

8821 tttcgcgcag tgattgcccg ttgcgtaacc cgggcgaccg cttaggagaa gaaggtaccg 

8881 gcgcacgctt gctggaagat gaacaaaact aaagcgccac aagggcgctt tagtttgttt 

8941 tccggtcttt gtctttctct ctatcccgct ggtacacagg agggtttccc ccgacgtcaa 

9001 cacacctcat tcgagcacgt ggtggaggtt ccggttggtg ttgatgcttt aattgtatgt 

9061 caccgacgtt tcttcgccag tgtaaaagta tactttttaa ccgcaatatt tttgtcatct 

9121 cagacgattt tttatcgcaa tcctgaacgg tatacggctc gataacgctg caatcttgcg 

9181 caccgacgat aacgtttgcg catcaattgc ctggtttttc atcgtcaaga caataaaaga 

9241 gaaaaaagca gcaaacttcg gttgaaaaag ccgctatgat cgccggataa tcgtttgctt 

9301 tttttaccac ccgttttgta tgcgcggagc taaacgtttg cttttttgcg acgcagcaaa 

9361 ttgtcgcaaa cctggagcag gaagataacg tttcgctggc aggggattgt ccgccacgca 

9421 tcttgacgaa aattaaactc tcaggggatg ttttcttatg tctacgccat cagcgcgtac 

9481 cggcggttca ctcgacgcct ggtttaaaat ttcacaacgt ggaagcactg tccgtcagga

ctgatgctga
ctggttacgg
catgatcggc
acggtcgtta

10801 tatctaatct

10861 ccgcacattg

cgtttctggc
tcccgcctgc
tgggtctgtg
ccgcattcag
tcctgatggg
gcaacttgcc
tcattgccgc
cgctgggtga
tcggcctgga
ttgtcggttt
tgagcgatga
ctgtagtcct
ctatccgtgc
gtgggaaagc
ctccggcagc
tgacggctat
acctcgttcc
gcaacgtggc
ctgtattcat
tggtgattgg
tcgccgtggc
tctgaaaacg
tgcgatattc

gatggtctac
ggcagttttc
ggctaatctg
cctggtgctg
tgtgctgttt
tcacggtgtg
taacggtgtc
tttcgcgacc
aaaactgaaa
gatcttcgat
aaacggcaat
gccaagcgtt
cgtcgccggc
actgaccact
ggtatacatc
caccgttggc
ggggtatgca
gaaaatcgac
cgtgctgacc
tcgtctggtt
gctggtgacc
ggtggcaatg
tgaaaaaaat

tcggtcatcg
gttgcaacct
ccgttggcga
gggcaacata
acggtaattt
gcgcacggca
ggtctggtga
ttcccggtga
gtccctggtg
cctaacgtcc
tcactgattg
ctggcgctgg
caggcgaacc
gactccatga
gagtctgcgg
gtgctgttcc
acggctccgg
tttgctgatt
tgtaacatcg
tccggcgaat
ttctatgcgg
gctgcccgtt
gagaattcag

tcgttccagg
gtctggttgc
ttggttgcgc
ttagcgtacc
ctgccacggg
cggggattgg
ttaaaaaccc
ttatgtcact
gcattctgct
atttctccgg
gcagcctgga
tgatgacggc
tgctggataa
gcagcgtttt
cgggtacggc
tcctgattct
cgctgatgta
ttgttgatgc
taacaggcat
ggcgcaagtt
gtggctgggc
tttattttct
gcataacgtc